REMARKS 



In response to the Office Action mailed on October 5, 2005, Applicants 
respectfully request reconsideration. Claims 1, 3-15, 17-26, and 28-42 are now 
pending in this Application. Claims 1, 15, and 26 are independent claims and the 
remaining claims are dependent claims. In this Amendment, claims 1, 15 and 26 
have been amended, claims 2, 16 and 27 have been cancelled and claims 40-42 
have been added. A version of the claims containing markings to show the 
changes made is included hereinabove. Applicants believe that the claims as 
presented are in condition for allowance. A notice to this effect is respectfully 
requested. 

Rejections under §112 
The Examiner rejected claims 1-39 under 35 U.S.C. §112, second 
paragraph, as being indefinite. Applicants respectfully disagree with the 
Examiner's assertion. Specifically, the Examiner stated that the claims recite 
"instructions" and "at least one function" and that it is not clear as to what types of 
instructions and functions are being referred to. Claim 1 recites "... said 
instructions and data directing said network processor to provide at least one 
function". Claim 2 recites the "... at least one function is selected from the group 
consisting of a network emulator, a network profile generator, a network profile 
capture tool, a packet generation tool, an application traffic generation tool, a 
real-time packet analysis tool, and a network packet capture and analysis tool." 
Therefore, the claims state that the instructions are defined as instructions 
directing the network processor to provide at least one function, and further claim 
2 recited that the function is selected from a group of functions. In this 
amendment claim 2 has been cancelled and the elements of claim 2 added to 
claim 1, such that claim 1 now states the instructions are used to direct the 
network processor to provide at least one function and that the functions 
provided by the network processor include at least one function selected from the 
group consisting of a network emulator, a network profile generator, a network 
profile capture tool, a packet generation tool, an application traffic generation 
tool, a real-time packet analysis tool, and a network packet capture and analysis 
tool. Similar changes have also been made to independent claims 15 and 26. 
Accordingly, the Examiner's rejection of claims 1-39 under §112, second 
paragraph, is believed to have been rendered moot. 

U.S. Application No.: 09/920,259 
Attorney Docket No.: EMP04-57 

The Examiner has made a provisional double patenting rejection 
regarding claims 1-39 over claims of co-pending application no. 09/920,482. 
Upon an indication of allowance, Applicants will promptly file a terminal 
disclaimer. 

The Examiner has also made a provisional double patenting rejection 
regarding claims 1-39 over claims of co-pending application no. 09/920,469. 
Upon an indication of allowance, Applicants will promptly file a terminal 
disclaimer. 

Rejections under §102 

The Examiner rejected claims 1-32 under 35 U.S.C. §102(e) as being 
anticipated by U.S. Patent No. 6,845,352 to Wang et al. (hereinafter Wang). 

Wang recites the use of a Finite State Machine (FSM) as part of a 
simulation engine. At column 5, lines 5-15, Wang states that the emulation manager 
preferably contains a finite state machine (FSM) that maintains a status of the 
emulation. 

In contrast to Wang, claim 1 recites the use of a network processor. As is 
known to one of reasonable skill in the art, a network processor is different from an 
FSM. A network processor is described in the specification as filed at page 6, 
lines 5-10, which states: 

The network processor is typically utilized to perform packet 
processing, cell processing, look-up table processing and queue 
management within a network switch or router. The present 
invention utilizes a network processor in a completely different 
manner by programming the various processors of the network 
processor to provide test system functionality instead of switching 
and routing functionality. 

Submitted herewith are three articles supporting Applicants' position 
regarding the differences between a Finite State Machine (as recited by Wang) 
and a network processor. The articles include a definition of a Finite State 
Machine, obtained from the Wikipedia website at 
http://en.wikipedia.org/wiki/Finite_state_machine, a white paper by David Husak 
titled "Network Processors: A Definition and Comparison" and a presentation by 
Jacob Engel titled "Network Processor Trends & Design". These references 
clearly show that a network processor is distinguishable from a Finite State 
Machine. 

By way of claim 1, a network processor, which is conventionally used to 
provide switching and routing functions in a network switch or router, is used in a 
different manner to provide test system functionality. Wang fails to disclose or 
suggest the use of a network processor (instead utilizing an FSM) to perform test 
system functions. 

Therefore, since claim 1 recites using a network processor to perform test 
system functions, while Wang utilizes an FSM, claim 1 is believed allowable over 
Wang. Claims 15 and 26 include similar language as claim 1, and are believed 
allowable over Wang for the same reasons that claim 1 is allowable over Wang. 
Claims 3-14, 17-25 and 28-42 depend from claim 1, 15 or 26 and are believed 
allowable as they depend from a base claim which is believed allowable. 
Accordingly, the rejection of claims 1-39 is believed to have been overcome. 

Claims 40-42 have been added. Support for claims 40-42 can be found in 
the specification as filed at page 6, lines 5-10. Applicants submit that no new 
matter has been added by the addition of claims 40-42, and further that the 
addition of claims 40-42 does not require any additional search by the Examiner. 

The prior art made of record is not believed to disclose or suggest the 
present invention. 

In view of the above, the Examiner's rejections are believed to have been 
overcome, placing claims 1, 3-15, 17-26 and 28-42 in condition for allowance, 
and reconsideration and allowance thereof is respectfully requested. 

There is no fee required. If the U.S. Patent and Trademark Office deems 
a fee necessary, this fee may be charged to the account of the undersigned, 
Deposit Account No. 50-3735. 

If the enclosed papers or fees are considered incomplete, the Patent 
Office is respectfully requested to contact the undersigned collect at 
(508) 616-9660, in Westborough, Massachusetts. 

Respectfully submitted, 



Attorney Docket No.: EMP04-57 
Dated: January 5, 2006 




David W. Rouille, Esq. 
Attorney for Applicant(s) 
Registration No.: 40,150 
Chapin Intellectual Property Law, L.L.C. 
Westborough Office Park 
1700 West Park Drive 
Westborough, Massachusetts 01581 
Telephone: (508)616-9660 
Facsimile: (508)616-9661 
Customer No.: 58406 

Finite state machine 

From Wikipedia, the free encyclopedia. 

A finite state machine (FSM) or finite automaton is a model of 
behavior composed of states, transitions and actions. A state stores 
information about the past, i.e. it reflects the input changes from the 
system start to the present moment. A transition indicates a state 
change and is described by a condition that would need to be fulfilled 
to enable the transition. An action is a description of an activity that is 
to be performed at a given moment. There are several action types: 

Entry action: execute the action when entering the state 
Exit action: execute the action when exiting the state 
Input action: execute the action dependent on present state and input conditions 
Transition action: execute the action when performing a certain transition 

FSM can be represented using a state diagram (or state transition 
diagram) as in figure 1. Besides this, several state transition table 
types are used. The most common representation is shown below: the 
combination of current state (B) and condition (Y) shows the next 
state (C). The complete actions information can be added only using 
footnotes. An FSM definition including the full actions information is 
possible using state tables (see also VFSM). 
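
The state, transition, and action model described above can be sketched in a few lines of Python. This is only an illustrative sketch and is not part of the cited article: the class name, method names, and logging callbacks are hypothetical, and the only transition taken from the text is that current state B under condition Y leads to state C. Leaving the state unchanged for an undefined pair is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Callable[[], None]


class FiniteStateMachine:
    """Table-driven FSM with optional entry and exit actions per state."""

    def __init__(
        self,
        initial: str,
        table: Dict[Tuple[str, str], str],
        entry_actions: Optional[Dict[str, Action]] = None,
        exit_actions: Optional[Dict[str, Action]] = None,
    ) -> None:
        self.state = initial
        self.table = table
        self.entry_actions = entry_actions or {}
        self.exit_actions = exit_actions or {}

    def fire(self, condition: str) -> str:
        """Apply a condition and return the state after any transition."""
        next_state = self.table.get((self.state, condition))
        if next_state is None:
            # Assumption: an undefined (state, condition) pair changes nothing.
            return self.state
        if self.state in self.exit_actions:
            self.exit_actions[self.state]()  # exit action of the old state
        self.state = next_state
        if next_state in self.entry_actions:
            self.entry_actions[next_state]()  # entry action of the new state
        return self.state


log: List[str] = []
fsm = FiniteStateMachine(
    initial="State B",
    # The single table entry named in the text: (State B, Condition Y) -> State C.
    table={("State B", "Condition Y"): "State C"},
    entry_actions={"State C": lambda: log.append("enter State C")},
    exit_actions={"State B": lambda: log.append("exit State B")},
)
fsm.fire("Condition Y")  # defined transition: B -> C
fsm.fire("Condition X")  # undefined pair: state stays at C
```

The sketch keeps the whole behavior in a lookup table keyed by (current state, condition), which mirrors the state transition table representation.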



[Fig. 1 Finite State Machine: state diagram; recoverable labels are "state", "transition condition", and "open-door"] 



State transition table 

Current State / Condition | State A | State B | State C 
Condition X               |         |         | 
Condition Y               |         | State C | 
Condition Z               |         |         | 









In addition to their use in modeling reactive systems presented here, finite state automata are significant in many 
different areas, including linguistics, computer science, philosophy, biology, mathematics, and logic. A complete 
survey of their applications is impossible here. Finite state machines are one type of the automata studied in 
automata theory and the theory of computation. In computer science, finite state machines are widely used in 
modelling of application behaviour, design of hardware digital systems, software engineering, compilers, study of 
computation and languages. 

A growing class of communications silicon, the Network Processor, promises to 
revolutionize how networking vendors architect, develop, and support their products. 
Network Processors deliver dramatic improvements in time-to-market, product lifetime, 
and system capabilities. This paper examines the benefits of Network Processors in 
comparison to other networking silicon offerings. 

A Brief History of Network Product Design 

The design of networking products has undergone continuous evolution as the speed and 
functionality of local and wide-area networks have grown. In the early days of 
packet-based networking, networking devices (such as bridges and routers) were built 
with a combination of general purpose CPUs, discrete logic, and ASSPs (Application 
Specific Standard Products), including interface controllers and transceivers. The 
software-based nature of these devices was key to adapting to new protocol standards 
and the additional functionality required by networks, such as the early Internet. Although 
these designs were large, complex, and comparatively slow, they met the needs of these 
early networks (generally comprised of a few Ethernet or Token Ring connections and 
slow (56kbps) wide-area links). 

Over time, as network interface speeds and densities increased, the performance of 
general-purpose processors fell short of what was needed. This led network vendors to 
develop simpler, fixed-function devices (such as Layer 2 Ethernet switches) that could be 
built with ASICs (Application Specific Integrated Circuits). These devices traded-off the 
programmability of software-based designs for hardware-based speed. As ASIC 
technology progressed (and vendors invested heavily in hardware-oriented design 
teams), more and more functionality was incorporated into the hardware. This was 
enabled in part by protocol consolidation around IP and Ethernet as the dominant 
enterprise network technology, which reduced the need for product flexibility. 

The relative simplification of network products has allowed merchant silicon vendors to 
"commoditize" some networking segments through specific chipsets, such as Layer 2 
Ethernet "switch-on-a-chip" products. Some of these solutions offer significant 
functionality within a narrow range of applications, such as ATM switching or basic 
Ethernet/IP switching. However, network vendors seeking clear product differentiation 
still required long and risky internal ASIC development programs. 

Today's Network System Development Challenge 

"It's the software, stupid!" 

Vint Cerf, Senior VP for Internet Architecture and Technology MCI WorldCom, and "Father of the Internet" 
ComSec Seminar, January 1999 

Today, the convergence of public voice and data networks is speeding up the pace of 
change in the communications industry. This is leading to increased time-to-market 
pressure and shorter product lifecycles — just when product development cycles are 
growing due to complex ASIC designs and associated software re-designs. 

Although IP is emerging as the dominant protocol, newly defined IP capabilities, such as 
Quality of Service (QoS) and Multiprotocol Label Switching (MPLS), require vendors to 
continually support new applications. In addition, the number of different interface types, 
ranging from sub-T1 through OC-48 in the WAN space in 
addition to 10/100 and Gigabit Ethernet in the LAN space, is 
increasing rather than decreasing. 

As a result, networking products require the same 
programmability and flexibility that was available in the 
early CPU-based architectures in order to quickly adapt to 
emerging standards, while maintaining the performance 
gains achieved through ASICs. To accomplish this, a radically 
new approach is required. See Figure 1. 

Figure 1 Network Processors Are the New Approach 

Network Processors: Universal 
Programmability and Performance 

Network Processors, emerging on the market today, deliver 
hardware-level performance to software programmable 
systems. This powerful combination offers a revolutionary 
approach to the design of communication systems. It allows 
systems designers to focus on higher-level services and 
ensures longer product lifecycles, rather than simply 
meeting the "speeds and feeds" of the moment. 

The power of true Network Processors is best examined in 
light of the seven attributes that are listed in Table 1 and 
described in the following sections. These attributes are 
derived from next-generation network requirements for 
programmability, performance, and openness. 



Table 1 Network Processor's Seven Key Attributes 

Attribute                      Benefit 
Complete programmability       Supports universal networking applications 
A simple programming model     Leads to faster time-to-market 
Maximum system flexibility     Enables longer time-in-market 
Massive processing power       Provides scalable performance 
High functional integration    Lowers total system costs 
Open programming interfaces    Delivers higher availability 
Third-party support            Encourages continuous innovation in the industry 



Complete Programmability 

For real platform leverage, a Network Processor must be 
universally applicable across a wide range of interfaces, 
protocols, and product types. This requires programmability 
at all levels of the protocol stack, from Layer 2 through Layer 
7. Protocol support must include packets, cells, and data 
streams (separately or in combination) across various 
interfaces to meet the requirements of carrier edge devices, 
for example, that are the cornerstone of the emerging 
multiservice carrier network. See Figure 2. 

Figure 2 Universal Switch-Router Line Cards Based on Network 
Processor 

[Figure 2 diagram; recoverable label: Interface Cards] 



This type of multiprotocol solution offers important 
time-to-market competitive advantages, and dramatically 
reduces support costs for both the network vendor and 
service provider. 

Simple Programming Model 

The programmability of a Network Processor must be 
readily accessible to the developer in order to be useful. By 
far the most common software languages in real-time 
communications systems are C and C++, with millions of 
skilled programmers and many more lines of existing code. 

Programming in the C and C++ languages also enhances 
the future portability of the code-base, enabling use in 
future generations of Network Processors and industry 
standard programming interfaces. This is not possible with 
specialized languages or state-machine codes. 

Maximum System Flexibility 

True Network Processors integrate all the functions 
implemented between the physical interfaces and the 
switching fabric, enabling an open approach for the PHY 
and fabric levels. This permits best-of-breed, multi-vendor 
solutions that allow vendors to offer true product 
differentiation and scalability. In addition, software 
implementation of these functions allows simpler upgrade 
paths in this constantly changing networking world. 

Massive Processing Power 

The architecture of the Network Processor needs to be more 
than the amalgamation of a few RISC core processors and 
some packet processing state machines. A fully optimized 
processing architecture, with a high MIPS (millions of 
instructions per second) to Gbps (Gigabits per second) ratio, 
is required to support wire-speed operation at high 
bandwidths and still have processing headroom for 
advanced applications. 
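
To make the MIPS-to-Gbps ratio concrete, the following Python sketch estimates how many instructions a processor can spend on each packet at wire speed. It is a back-of-the-envelope illustration only: the function name and the example figures (1000 MIPS, 1 Gbps, 64-byte packets) are hypothetical and do not come from the paper, and framing overhead such as preamble and inter-packet gap is ignored.

```python
def instructions_per_packet(mips: float, line_rate_gbps: float, packet_bytes: int) -> float:
    """Return the per-packet instruction budget at full line rate.

    Assumption: every bit on the wire belongs to a packet of `packet_bytes`
    bytes, so framing overhead is ignored.
    """
    if mips <= 0 or line_rate_gbps <= 0 or packet_bytes <= 0:
        raise ValueError("all inputs must be positive")
    packets_per_second = (line_rate_gbps * 1e9) / (packet_bytes * 8)
    return (mips * 1e6) / packets_per_second


# Hypothetical example: 1000 MIPS serving a 1 Gbps link of 64-byte packets.
budget = instructions_per_packet(mips=1000, line_rate_gbps=1.0, packet_bytes=64)
```

Under these assumed figures the budget works out to 512 instructions per packet. Doubling the line rate at the same MIPS halves that budget, which is why the ratio, rather than raw MIPS, determines the headroom left for advanced applications.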

High Functional Integration 

Network Processors need to provide a high level of system 
integration that dramatically reduces part count and system 
complexity, while simultaneously improving performance, 
as compared to using a design that incorporates multiple 
components (such as ASSPs). 

In addition, a highly integrated Network Processor avoids 
the interconnection bottlenecks common with component 
oriented designs. Integrated coprocessor engines (such as 
for classification or queuing) can be fully utilized by internal 
processing units without interconnection penalties. 

Integration of lower layer functions (such as SONET framers) 
within the chip also enables higher port densities and lower 
costs than have typically been possible in the past. 

Figure 3 and Figure 4 provide a comparison of a multiple 
component system versus a highly-integrated system. 



Figure 3 Typical Interworking Design Using ASSPs and CPU 




Figure 4 Interworking Design Using a Highly Integrated Network 
Processor 




Stable Programming Interfaces 

A communication processor cannot deliver on software 
flexibility and portability if the programming interfaces are 
dependent on the processor. The processor's architecture 
must support generic "Communications Programming 
Interfaces" to simplify the programming task and allow 
future software reuse across generations of the processor. 

By delivering software stability across product generations, 
Network Processors radically improve software 
development cycles and reliability. Software reliability is the 
largest factor in total system availability. 

Third-Party Support 

To realize the full potential of a software-driven 
environment, the Network Processor needs to be the 
foundation of a complete communications platform that 
takes advantage of industry-wide hardware extensions, 
software applications, and tool suites. This is only possible 
with an architecture that has the flexibility to support 
virtually any third-party protocol stack, any PHY or fabric 
interface, and links with industry standard tools. Such broad 
support significantly decreases time-to-market. 

Network System Design Alternatives 

Of course, before the Network Processor there were a 
number of design alternatives that in their own ways 
provided some assistance in building better networking 
products. From using completely hard-wired solutions to 
configurable processors, and more recently, network 
processor chipsets, networking vendors have incrementally 
improved and evolved their designs, but not without major 
compromises. 

Custom ASIC Designs 

Until recently, common practice in high-speed networking 
design has involved the development of custom ASICs for 
critical elements of the architecture. This approach has been 
dictated by the requirement for "wire-speed" performance 
at reasonable cost. 

Most vendors have had limited success in leveraging an 
ASIC or ASIC family into multiple product lines, preventing 
them from amortizing the development costs across a 
broad range of revenue-generating products. 
Implementing product architectures in ASICs is a high-cost 
proposition from a number of perspectives: 



• The design cycle is typically 18 months (and can extend 
beyond 3 years). Projecting market requirements that far 
in advance is difficult given the competitive dynamics of 
the market, resulting in the same company needing to 
place "multiple bets" to assure market success. 

• The risk of design failures in ASIC-based development is 
large, given the many months that are often required to 
correct design flaws (due to the lack of flexibility present 
in hardware-based designs). 

• The limited flexibility of hardware-based designs 
severely limits the ability to adjust product functionality 
to evolving market demands before and after market 
introduction. The result is shorter product lifecycles and 
greater to-market risks. 

• ASIC design expertise is a rare commodity. The ability to 
hire and retain talented designers has become a 
fundamental limit to the rate of product development 
for many vendors. 



• The design tools for complex ASICs can run in the 
millions of dollars (ASIC emulators, for instance) and 
require constant refresh as the technology advances. 

• Perhaps the largest, and often hidden cost, is the need to 
re-architect and re-write critical software associated with 
each product generation. Frequently, the extent of the 
re-write is unforeseen and prescribed by the need to 
optimize the designs around the hardware and ASIC 
technology, rather than around software re-use. Thus 
regardless of how much key functionality is embedded 
in the hardware, a massive amount of "slow-path" 
software is generally still required. 

Additionally, the large opportunity cost of software 
re-writes often prevents vendors from delivering the 
value-added services and applications that provide true 
market differentiation. 

While there remains a set of products that will require the 
customization available from ASICs, most vendors are eager 
to move to design alternatives that will improve their 
time-to-market and reduce development risks. 

Customizable ASICs and Configurable Processor 
Designs 

Several new design technologies are emerging to address 
some of the issues and risks of ASIC-based designs. These 
include: 

• Integrated Circuits (ICs) incorporating fixed-function 
network logic blocks with configurable interconnects 
(sometimes termed "systems-on-a-chip") 

• Configurable processor cores with changeable 
instruction sets that allow limited modifications to 
accomplish some network-specific tasks 

Configurable "Systems-on-a-Chip" 

Configurable "system-on-a-chip" approaches mix a number 
of fixed-function blocks, perhaps including a CPU-core, on a 
single chip with FPGA-like configurable interconnects. 
These devices speed-up the development cycle by enabling 
designers to choose from a "menu" of available functions 
that they assemble to build the desired part. Such devices 
sometimes promise future field re-configurability of the 
interconnection between different elements. 

While this approach offers some time-to-market 
advantages compared to traditional ASICs, having a 
collection of fixed-function blocks limits the flexibility to 
adapt to new features and standards because the design 
remains essentially in hardware. In addition, having a single 
software-capable CPU element limits both performance 
and programmability. 

Configurable Processor Cores 

Configurable processor cores embedded in an ASIC design 
allow customization of the instruction set. In networking 
applications, this may allow the designer to create 
specialized instructions for certain communications tasks, 
such as the implementation of specific software encryption 
algorithms. However, such architectures assume that the 
software is handling the data and thus the processor must 
be inserted in the primary data path. This approach does 
not scale and has been the main reason discrete 
CPU-oriented systems have failed to keep pace in the past. 
Moreover, this approach also does not address the 
fundamental bottleneck in "soft" architectures: 
separating the control path from the data path in an 
effective and scalable way. 

Both the configurable "systems-on-a-chip" and 
configurable processor cores are variations on the custom 
chip theme, and suffer in varying degrees the limitations 
described above for ASIC-based approaches. In particular, 
these methods do not address maximizing software re-use 
as the key to higher reliability, faster time-to-market, 
increased product lifespan, and the delivery of expanded 
network services. 

Application-Specific Standard Products 

Most communications system designs, whether based on 
CPU or ASIC architectures, make use of some specialized 
components. These are often used where a specific function 
is difficult to build into an ASIC, is available in a low cost 
off-the-shelf component, or is not central to the system 
design. An example of an ASSP-based design is shown in 
Figure 3. Some silicon vendors have continued to make 
standalone ASSPs attractive by supplying increasing 
functional density, such as has occurred with low-speed 
framers and physical interfaces (transceivers). 

Smart MACs 

Communications IC vendors have also been working to 
make some components smarter, integrating more and 
more of the functions that would normally be handled by a 
CPU and software (or in hardware with a custom ASIC) into 
the component. 

For example, some makers of Ethernet MAC ICs have begun 
incorporating some protocol data parsing and processing 
functions within the interface, alleviating some of the 
lower-level tasks from the system software. This can be 
beneficial for certain classes of products (such as network 
interface cards, for example), but is really only a small 
incremental improvement over traditional design 
methodologies for mainstream communication systems. 

Single-Function Components 

A different approach has been taken by other vendors who 
have set out to design optimized components addressing a 
single, higher-level function within the system. Examples 
include IP address lookup engines and traffic classifiers. 
These components reduce the number of functions the 
system designer must implement in custom ASICs and 
subsequently reduce time-to-market. Some of these 
components also represent the state of the art for a particular 
function, increasing the capability of the system solution. 

However, systems based on these devices still suffer from 
the limitations of a hardware-oriented approach. The 
configurability provided in the components is usually only 
enough to support the specific design, but not enough to 
adapt to emerging customer requirements or standards. 
Higher-level services, which must be implemented in 
software, are also limited (if not prevented) by this 
approach. 

Perhaps the biggest obstacle to single-function 
components is the level of effort required to effectively 
integrate various components into a complete system. For 
the higher bandwidth interfaces (like Gigabit Ethernet, 
OC-12, and OC-48), the interconnect design between 
components is often the primary system bottleneck. 
Multiple components lead to more complex hardware 
designs, less scalability, and increased time-to-market. 

Programmable Communications Components 

There are classes of communications-focused ICs with 
programmability similar to Network Processors. No matter 
how programmable a specific component may be, however, 
it is still limited by the overall system design issues of 
integrating various independent components. See Figure 5. 

Digital Signal Processors 

Digital Signal Processors (DSPs) offer a great deal of 
flexibility in the implementation of signal processing 
algorithms for a wide range of physical layer applications 
such as high-speed modems. With custom instruction sets 
for fixed and floating point arithmetic, DSPs are optimized 
for the mathematically intensive algorithms used in 
advanced signal processing. 

Some vendors have proposed using DSPs (or multiple DSP 
cores on a single chip) to expand beyond pure signal 
processing into higher-level protocol handling. While some 
protocol processing may be supported within DSP 
architectures, the basic tasks of data formatting, parsing, 
classification, modification, and switching are 
fundamentally different from the mathematically oriented 
tasks of DSPs. 

The tools for programming DSPs are also oriented toward 
algorithmic implementations and require specialized 
language support. So, while DSPs are a great example of the 
power of programmability within communications systems, 
they are not an adequate universal processing solution for 
the higher-level protocol processing functions. 

Configurable State Machine Engines 

Another approach for achieving flexibility at the component 
level is the application of configurable state machine 
engines for off-loading some of the protocol processing 
from general purpose CPUs. 

These devices have sometimes been classified as "network 
processors" although they do not execute any "software" in 
the traditional sense. Instead, they have a series of 
configurable state machines that perform some of the 
framing, data parsing, and classification functions. Based on 
the configuration, these devices may pre-process ATM, 
Frame Relay, or Ethernet formatted data for a 
general-purpose CPU, for additional components (such as a 
MAC, classification engine, or custom ASIC), or for both. 
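
As a rough software analogy for such a device, the Python sketch below drives a small parsing engine entirely from a configuration table, so that changing the table (not the engine) changes the format it handles. Everything here is illustrative and assumed: the `FieldState` structure, the table layout, and the function names are hypothetical and do not describe any vendor's product. The Ethernet II field lengths and the EtherType values are standard.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FieldState:
    """One configurable state: consume `length` bytes, then move to `next_state`."""
    name: str
    length: int
    next_state: str  # empty string ends the run


# Hypothetical configuration for an Ethernet II header (6 + 6 + 2 bytes).
ETHERNET_CONFIG: Dict[str, FieldState] = {
    "dst": FieldState("dst", 6, "src"),
    "src": FieldState("src", 6, "ethertype"),
    "ethertype": FieldState("ethertype", 2, ""),
}

# Standard EtherType values used for a simple classification step.
CLASS_BY_ETHERTYPE = {0x0800: "ipv4", 0x0806: "arp", 0x86DD: "ipv6"}


def run_engine(config: Dict[str, FieldState], start: str, frame: bytes) -> Dict[str, bytes]:
    """Walk the configured states over the frame and return the parsed fields."""
    fields: Dict[str, bytes] = {}
    offset = 0
    state = start
    while state:
        spec = config[state]
        chunk = frame[offset:offset + spec.length]
        if len(chunk) < spec.length:
            raise ValueError(f"frame too short for field {spec.name!r}")
        fields[spec.name] = chunk
        offset += spec.length
        state = spec.next_state
    return fields


def classify(frame: bytes) -> str:
    """Parse the header with the configured states, then classify by EtherType."""
    fields = run_engine(ETHERNET_CONFIG, "dst", frame)
    ethertype = int.from_bytes(fields["ethertype"], "big")
    return CLASS_BY_ETHERTYPE.get(ethertype, "unknown")


sample = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
result = classify(sample)
```

The engine only pre-processes the header and hands back a classification; as the text notes, switching and higher-level processing would still fall to a CPU or other components.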

Figure 5 shows a product design using a configurable state 
machine engine. 

Figure 5 Typical State Machine Engine-Based Design 




The state machines are configured through CPU-accessible 
registers or external devices (such as FPGAs). Because the 
configuration of state machines can be quite complex, 
some vendors implement the required functions using 
specialized procedural languages to generate the actual 
state machine code, while other vendors provide a suite of 
pre-configured codes for a variety of "canned" applications. 

Although these state-machine-oriented devices offer more 
flexibility than typical fixed-function ASSPs, they suffer from 
the same architectural limitations. The design must still 
revolve around a general-purpose CPU or a custom ASIC in 
the switching path, with the requisite performance, 
flexibility, and time-to-market trade-offs. 

Programmable Special Purpose Devices 

Some communication component vendors have focused on 
increasing the programmability of their single-function 
components in order to provide better future adaptability 
and to broaden the market appeal of their devices. 

An example is segmentation and reassembly (SAR) devices, 
designed specifically to perform the interworking between 
frame-based (Ethernet and IP) networks and ATM-based 
networks. SAR architectures typically consist of Utopia 
interfaces, frame and cell parsing logic, dedicated 
scheduling and queuing support, and a custom processor 
for implementing the interworking protocols. A software- 
oriented processor is attractive in SAR components due to 
the rapidly evolving ATM interworking standards. 
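
The core interworking step, cutting a variable-length frame into fixed-size cell payloads and rebuilding it on the far side, can be sketched in Python as follows. This is a deliberately simplified illustration and not a real ATM adaptation layer: the 48-byte ATM cell payload size is standard, but the 2-byte length prefix, the zero padding, and the function names are assumptions made for this sketch (actual adaptation layers use their own trailer and checksum formats, and cell headers are omitted here).

```python
from typing import List

CELL_PAYLOAD_BYTES = 48  # payload size of a standard ATM cell


def segment(frame: bytes) -> List[bytes]:
    """Split a frame into 48-byte cell payloads.

    Assumption: a hypothetical 2-byte big-endian length prefix records the
    original frame size so padding can be removed on reassembly.
    """
    if len(frame) > 0xFFFF:
        raise ValueError("frame too long for the 2-byte length prefix")
    data = len(frame).to_bytes(2, "big") + frame
    padding = (-len(data)) % CELL_PAYLOAD_BYTES
    data += bytes(padding)  # zero-fill the final cell
    return [data[i:i + CELL_PAYLOAD_BYTES] for i in range(0, len(data), CELL_PAYLOAD_BYTES)]


def reassemble(cells: List[bytes]) -> bytes:
    """Rebuild the original frame from an ordered list of cell payloads."""
    data = b"".join(cells)
    length = int.from_bytes(data[:2], "big")
    return data[2:2 + length]


frame = bytes(range(100))
cells = segment(frame)
restored = reassemble(cells)
```

In this sketch a 100-byte frame occupies three cell payloads, with the last one partly padding, which illustrates why SAR devices need dedicated parsing, scheduling, and queuing support around the interworking processor.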

Vendors of other components, such as HDLC controllers, are 
also allowing the "extra" processing cycles within their 
devices to be used for customer-defined applications. 

There are many difficulties when applying these devices 
beyond their originally intended purpose (SARing, HDLC 
multiplexing, and so on). For example, it can be difficult to 
determine exactly how many "extra" cycles are really 
available for custom processing. Further, the internal 
processors themselves are typically proprietary CPUs, 
specifically designed for one function. This means 
questionable suitability to more general processing tasks, 
often surprisingly large impacts on system performance, 
and the possible immaturity of the programming tools. 

Pattern Matching Processors 

Another approach to providing programmability is a further 
extension of the single-function component. Examples of 
this include "pattern matching processors" that focus on 
providing configurable classification engines (see Figure 6). 

Figure 6 Typical Pattern Processor Design (OC-12 WAN Interface) 




Pattern matching processors provide more flexibility and 
configurability than the fixed-function devices described 
above, even allowing support for multiple protocol types 
(ATM, IP, and so on). The value of these devices is in 
embedded algorithms specifically useful for classification, 
which are sometimes configurable through a proprietary 
programming language. 

Aside from the obvious issues with proprietary languages, it 
is often difficult to evaluate the performance of these 
processors within an overall system design, due to the 
performance links between the classification functions and 
the switching and routing functions that must be 
implemented elsewhere in the design. 

Network Processor Chip Sets 

One of the fastest growing areas of merchant 
communications silicon is switching chipsets. 
Many ATM switching platforms are based on standard 
silicon, as are most low-end Ethernet workgroup switches. 

For low-end systems, the obvious benefit is the ability 
to develop commodity-oriented products quickly and at 
very low cost. High-end systems, by contrast, are often built 
with a mixture of the standard components that make up 
the chipset and custom designs (usually ASICs) that provide 
vendor differentiation. 



Among the high-end switching chipsets are the early 
"network processors" offering a complete fabric and packet 
processing solution for systems ranging up to multi-gigabit 
performance. Some of these architectures support both 
packet and cell-oriented systems, though not at the same 
time. Additionally, there may be fixed, limited interfaces 
within the architecture that enable networking vendors to 
program a small amount of functionality, or pass data to an 
external device (usually of custom design). 

A key drawback of the "switch-on-a-chip" and network 
processor designs is the limited flexibility for providing 
system differentiation or additional services beyond those 
envisioned by the original silicon architects. Invariably, the 
instruction sets provided in the architecture are proprietary, 
primitive, and have limited tool support. It is not 
uncommon for designers to discover the performance and 
functional limitations of these interfaces late in the design 
cycle, forcing time-to-market delays and critical functional 
trade-offs. 

Most of these solutions also use proprietary interconnects 
between the various chipset components, from port 
processing through switching fabric. Not only does this limit 
the ability of the networking vendor to choose 
"best-in-class" solutions, but the silicon architecture tends 
to "blur the lines" between functions. The end result is 
limited scalability of the final product, preventing future 
growth and adaptability of the product line. Such an "all or 
nothing" approach to system design can often be difficult 
for networking vendors to accept for strategic product lines. 

For commodity-oriented communication products, 
complete "switch-on-chip" solutions can be viable 
time-to-market approaches. However, for higher-end 
products that must live in a complex and evolving 
application environment, an open approach (from both 
hardware and software perspectives) is required. 

Table 2 Comparison of Network System Design Approaches 

[Table body not recoverable from the source. Approaches 
compared: Custom ASICs; Configurable Processors 
(Configurable SOC, Configurable Processor Cores); 
Application-Specific Standard Products (ASSPs) (Smart MACs, 
Single Function Components); Programmable Communications 
Components (DSPs, State Machine Engines, Special Purpose 
Devices, Pattern Matching Processors); Switching Chipsets 
(L2 chipsets, Network Processor chip sets).] 

Rating key: ++ is excellent; + is good; - is fair; and -- is poor 



Summary 

As you can see, Network Processors offer a revolutionary 
way of developing networking products that deliver 
dramatic improvements in time-to-market and 
time-in-market™. Table 2 provides a comparison of Network 
Processors to the alternatives discussed in this paper. 

Only true Network Processors, like the C-Port C-5 Digital 
Network Processor (DCP), offer all of these significant 
benefits: 

• Complete programmability — At all protocol layers 
from Layer 2 through 7, enabling adaptability to a wide 
range of requirements at any point in the network 
hierarchy. 

• Simple programming model — Leveraging well 
known programming methods and languages (C and 
C++) to allow faster time-to-market and portability of 
code across platforms. 

• Maximum system flexibility — Maintaining a "soft" 
approach to enable new services and standards to be 
deployed with software-only upgrades. 

• Massive processing power — Required to fully and 
robustly implement the key networking functions and 
new services, and deliver wire-speed operation at high 
bandwidths. 



C-5, C-Port, C-Ware CPI, C-Ware Partner, the C-Port logo, and time-in-market are all 
trademarks of C-Port Corporation. 

• High functional integration — Implementing all the 
network functions in a single chip solution to lower total 
system costs. 

• Stable programming interfaces — Simultaneously 
simplifying programming tasks and maximizing 
software reuse for future product generations. 

• Third-Party support — Leveraging the proven 
solutions of industry leaders in the software and 
hardware development community for faster 
time-to-market and better reliability. 

Because of these advantages, networking vendors can 
leverage the power and flexibility of software to apply a 
platform approach to system development, and focus more 
R&D resources on delivering the functions and services 
demanded in today's highly competitive market. 




C-PORT. 

A Motorola Company 

C-Port Corporation 
One High Street 
North Andover, MA 01845 
978-773-2300 TEL 
978-773-2301 FAX 

www.cportcorp.com 
www.mot-sps.com 



For More Information On This Product, 
Go to: www.freescale.com 






NP evolution & functionality 




Performance 



General Purpose Processor (GPP): 

♦ Advantages 

• Quick time-to-market 

• Core performs all routing functionalities 

• Flexible to upgrade the system 

♦ Disadvantages 

• Not easy to scale up the system 

• Major reduction in performance for complex 
operations (traffic management, QoS) 



ASIC based 

[Partially legible: "Supports ... performance & flexibility"; 
"Uses pipelining ... threads (multiple PEs)"] 

• Wire-speed performance 

• Integrated with additional GPP/RISC 
processors to separate control vs. data [illegible] 

• Lacks flexibility 

• Long design cycle, high time-to-market 

• Change in design / failure -> high risk 

• Complex operations still done in SW 



Generation I: RISC-based 

• Numerous instructions -> more time 

• Using parallelism -> increasing system complexity and chip size 

Generation II: Augmented RISC-based 

• RISC + ASIC -> accelerate hardware 

• Less flexible 

Generation III: Network-specific processor 



Key characteristics of the architecture 



Programmable 

The essence of an NP is that it is programmable. High performance is maintained by 
implementing the NPs with their own optimized instruction set for the task of 
processing network packets. 

Modular 

The architectures employ different schemes and levels of modularity. 
To obtain high performance and scalability, components of the NP and connecting 
technology must be programmable to perform the work they are best suited for. 

Most of the architectures discussed allow network vendors to build anything from 
small, inexpensive network devices to large, high-performance ones. 

NP vendors have some kind of switching fabric that takes the place of the central bus 
and offers a full crossbar switching architecture. 

Parallel 

Many of the architectures have micro-engines that can be programmed and can 
intelligently process data at wire speeds. They accomplish high throughput by 
supporting multiple threads on each engine (multithreading). Other designs employ 
super-pipelined and superscalar architectures for massive processing power. 



Key characteristics of the architecture 

Integrated bus 

Typical NP architecture has some type of integrated bus. This bus integrates the 
processor cores, the memory systems, interfaces to the physical adapters and the 
host system bus. This integration reduces part count and system complexity, 
while improving performance. 

High-bandwidth memory 

High-bandwidth, low-latency memory is essential for the network processors to 
achieve the necessary speed. The architectures either embed the 
needed memory on the chip or employ scratchpad memory, high register counts 
and a sophisticated interface to external memory that reduces contention. 

Wire-speed intelligence 

To support features such as QoS, algorithms are being developed to classify, mark 
and regulate packet flow. 

The hardware offers the opportunity to perform these operations at wire speeds 
with well-designed and well-implemented algorithms. 

Table lookup algorithms 

Some designs have special table lookup units that can handle multiple lookup 
algorithms simultaneously at very high speeds. 



Application categorization 




Data-plane operations - examples 

o Priority-based QoS mechanism 

• Supports different levels of QoS for each output port 

• Contains QoS policy table - prioritizing packets 

• Ingress operations 

  o Applies QoS policy on the packet received 

  o Gets the packet priority from its header content 

  o Places the packet in the appropriate output queue 

• Egress operations 

  o Identifies & schedules highest priority packet for 
    transmission 

  o Transmits the identified packet on to the output port 

o Security 

• Encryption/decryption, intrusion detection, access control 
  checking, denial-of-service 

o Monitoring 

• Capturing usage patterns, time information 

o Load balancing 

• Distribution of traffic among servers according to the server 
  load, content and client credentials 
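
The ingress and egress steps listed above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular network processor implements it: the `QOS_POLICY` table, the `traffic_class` header field, the three priority levels and the strict-priority scheduling rule are all assumptions made for the example.

```python
from collections import deque

# Hypothetical policy table: maps a header field value (here a traffic-class
# string) to a priority level, where 0 is the highest priority.
QOS_POLICY = {"voice": 0, "video": 1, "data": 2}
NUM_PRIORITIES = 3


class OutputPort:
    """One output port with a FIFO queue per priority level."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.transmitted = []

    def ingress(self, packet):
        # Apply the QoS policy: derive the priority from the packet header,
        # then place the packet in the matching output queue. Unknown
        # traffic classes fall back to the lowest priority.
        traffic_class = packet["header"]["traffic_class"]
        priority = QOS_POLICY.get(traffic_class, NUM_PRIORITIES - 1)
        self.queues[priority].append(packet)

    def egress(self):
        # Strict-priority scheduling: transmit from the highest-priority
        # non-empty queue; return None when every queue is empty.
        for queue in self.queues:
            if queue:
                packet = queue.popleft()
                self.transmitted.append(packet)
                return packet
        return None
```

Strict priority is the simplest scheduler that matches the slide's wording ("schedules highest priority packet"); a real device would typically add weighting or rate limits so that low-priority queues are not starved.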




NP basic architecture & packet processing flow 

• Controls data flows across the network 
to optimize network resources, 
providing bandwidth & delay guarantees 

• Packet segmentation into fields 

• Classification - inspection of packet 
properties in order to determine 
the best way to process it 

• The µP handles initial setup and 
exception processing 

• Packet header modification 
needed to process and route 
to destination 

7-layer processing WHY? 

Differentiation of networking equipment enables the vendor to provide better service to 
the customer, supporting applications such as: 

• Policy-based networking 

Enabling fine-grained service provisioning in accordance with flexible policy rules, 
including bandwidth allocation, priority definition, security enforcement, route selection 

• Server load balancing 

Traffic distribution among servers in accordance with web destination 
URL, client credentials (cookies), server load 

• NAT (Network Address Translation): layer 3 - IP addresses, layer 4 - port 
numbers, layers 5-7 - address instantiations 

• Forwarding frames: layer 2 - MAC addresses, layer 3 - IP addresses 

• Target server selection: layer 5-7 - URL and cookie extraction 

• Server and client message handling: layer 5-7 - cookie modification 






7-layer processing WHY? 

• Usage-based accounting 

Provision of detailed per-flow network bandwidth usage for billing & capacity planning 

Example: tracking the download and bandwidth usage of a specific client when downloading 
a video clip 

• Recognize session initiation for specific server - layer 3 IP address and layer 4 
port number 

• Monitor login session to identify the user name - layer 5-7 extraction of login info 

• Identify desired file name to download - layer 5-7 extraction of file name and 
matching to program policy tables 

• MPLS traffic engineering 

Delivery of highly granular QoS required for time-sensitive applications 
(voice & video over IP, real-time business transactions) 

• Network monitoring & analysis 

Intelligent network management, alert notification and troubleshooting 
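
The usage-based accounting example above amounts to keeping a byte counter per flow and attaching the user name and file name once the deeper layers reveal them. A minimal sketch follows, assuming a flow is identified by client IP, server IP and server port; the `FlowAccounting` class and its method names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


class FlowAccounting:
    """Per-flow byte counters with optional layer 5-7 annotations."""

    def __init__(self):
        self.bytes_by_flow = defaultdict(int)
        self.info_by_flow = {}

    def on_packet(self, client_ip, server_ip, server_port, length,
                  user=None, file_name=None):
        # Layers 3-4: the addresses and port identify the session.
        flow = (client_ip, server_ip, server_port)
        self.bytes_by_flow[flow] += length
        # Layers 5-7: record login and file name when a packet carries them.
        info = self.info_by_flow.setdefault(flow, {})
        if user is not None:
            info["user"] = user
        if file_name is not None:
            info["file_name"] = file_name

    def usage_by_user(self):
        # Aggregate bytes per identified user, e.g. for billing.
        totals = defaultdict(int)
        for flow, count in self.bytes_by_flow.items():
            user = self.info_by_flow[flow].get("user", "unknown")
            totals[user] += count
        return dict(totals)
```

Extracting the user name and file name from the payload is the deep-packet step the slide refers to; here it is assumed to have happened already and is passed in as arguments.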



The memory challenge 



The challenge 

• As bandwidth requirements grow -> network speed increases => 
memory becomes the bottleneck 

• 7-layer processing (deep packet processing) places heavy 
strain on memory usage (reading 
and writing significantly more data to/from memory) 

• Packet storage and lookup table searches 

Packet buffer memory => DDR DRAM / SRAM / RAMBUS 
This buffer memory is accessed at least 4 times per packet: 

• Writing the packet when received from the network 

• Reading frame data for processing (lookup tables) 

• Writing modified packet for transmission 

• Reading the packet for transmission 

Therefore, in order to sustain wire-speed performance, 
the PBM should provide at least 4 times the bandwidth of 
the network link, i.e., 40 Gbps 

Lookup tables memory => CAM 

During classification, headers and data fields of an incoming 
packet are used for searching in various lookup tables 
containing QoS, control, policy information, IP addresses. 
Therefore, it requires simultaneous table access 
to separate memory cores. When considering 
full duplex data transfer we get 160 Gbps memory 
bandwidth!! 
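
The packet buffer figure above is a simple multiplication, sketched below. The 10 Gbps link rate is an assumption inferred from the slide's 40 Gbps result and its four accesses per packet; the function name is illustrative only. The slide does not show how the 160 Gbps figure is derived, so it is not reproduced here.

```python
def packet_buffer_bandwidth_gbps(link_gbps, accesses_per_packet=4):
    """Minimum buffer memory bandwidth needed to keep up with the link.

    Each packet is written on receive, read for processing, written back
    after modification and read again for transmission, so every bit on
    the link crosses the memory interface `accesses_per_packet` times.
    """
    return link_gbps * accesses_per_packet


# Assumed 10 Gbps link, four accesses per packet.
required = packet_buffer_bandwidth_gbps(10)
```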

Quality of service (QoS) 

"A mechanism for networks to satisfy the varied 
quality and grade of service required by application 
while at the same time maximizing bandwidth utilization" 

• Applications are classified into groups and these groups 
are carried across the network over a finite set of QoS 
classes 

• Applications such as interactive video and voice are 
delay and loss sensitive => supported by constant bit 
rate (CBR) service 

• Applications NOT delay/loss sensitive => variable bit 
rate (VBR) service 

• Example of QoS classes 

• ATM classes - six QoS classes offering loss, delay, 
and jitter guarantees 

• IP DiffServ - uses the ToS IP field to differentiate 
packets. Three main classes: Expedited 
Forwarding (EF) low loss/delay; Assured 
Forwarding (AF) not as stringent as EF; Best 
effort packet switching 

• MPLS labels - packets are encapsulated and 
assigned labels which provide service differentiation 
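
As an illustration of the DiffServ bullet, the sketch below maps the ToS byte of an IP header to one of the three classes named on the slide. It relies on the standard DiffServ encoding (the code point occupies the upper six bits of the former ToS byte, EF is code point 46, and AF code points are 8 x class + 2 x drop precedence). The function name is hypothetical, and treating every other code point as best effort is a simplification made for this example.

```python
def classify_tos(tos_byte):
    """Return 'EF', 'AFxy' or 'BE' for the given IP ToS byte."""
    dscp = (tos_byte >> 2) & 0x3F  # upper six bits carry the code point
    if dscp == 46:
        return "EF"
    af_class = dscp >> 3   # 1..4 for Assured Forwarding
    low_bits = dscp & 0x7  # 2, 4 or 6 encode drop precedence 1..3
    if 1 <= af_class <= 4 and low_bits in (2, 4, 6):
        return f"AF{af_class}{low_bits // 2}"
    # Simplification: everything else is handled as best effort.
    return "BE"
```

For example, a ToS byte of 0xB8 carries code point 46 and is classified as EF, while 0x28 carries code point 10 and is classified as AF11.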




Applications loss vs. delay requirements 



QoS mechanisms 






NP architectural approaches 



INTEL IXP1200 



StrongARM 32-bit 
RISC processor 

SRAM 8MB used to 
store LUT and micro 
engines shared data 

FBI unit provides interface 
to the packet dataflow 

SDRAM 256MB used as 
storage of massive data 
such as packet buffers 

6 micro engines (32-bit) 
RISC, each includes: 
4KB instruction cache 
256 32-bit registers 
Supports context switching 



NP architectural approaches 



MOTOROLA C-5 

Fabric processor connecting 
to external switch fabric 
assists in passing data 
to/from the fabric 

RISC processor to provide general supervising 
management responsible for initialization, 
exception handling, statistics gathering & host 
computer interfacing 

Manages 512 queues 
used after scheduling 
to forward the packets 
to their output ports 

Interfaces the external 
SDRAM for packet 
payload storage 

16 Channel Processors (PE) 
RISC core; perform classification 
and scheduling decisions 



NP architectural approaches 

Agere PayloadPlus 

Fast Pattern Processor (FPP) 

Pipelined multithreaded processor 
Simultaneous 64 packet classification 
Output result sent to RSP 

Fast Path operations 

Slow Path operations 

Routing Switch Processor (RSP) 

VLIW processors that can run 
multiple programs simultaneously 

Performs queuing, traffic 
management, shaping and packet 
modifications 



NP architectural approaches 

EZchip NP-2: 

10 Gb wire speed 7-layer NP 
Eliminates the need for CAMs 
Supports deep-packet 
processing: layers 2-7 
Highly programmable to 
achieve fast time to market 

Protocol-independent search engines 
Multiple search engines operating in 
parallel with pipelining of memory 
cycles and intelligent partitioning to 
overcome the DRAM latency barrier 

Packet fields: tags, addresses, protocols, patterns 

Lookup / Classification 

Forwarding decisions, QoS, updates 

Packet modifications 

Lookup engine with minimal instruction 
set recombines data into keys 
Performs chained lookups (a key from 
one packet can be used to search 
another packet) 